Computational fluid dynamics (CFD) is a valuable asset for patient-specific cardiovascular-disease diagnosis and prognosis, but its high computational demands hamper its adoption in practice. Machine-learning methods that estimate blood flow in individual patients could accelerate or replace CFD simulation to overcome these limitations. In this work, we consider the estimation of vector-valued quantities on the wall of three-dimensional geometric artery models. We employ group-equivariant graph convolution in an end-to-end SE(3)-equivariant neural network that operates directly on triangular surface meshes and makes efficient use of training data. We run experiments on a large dataset of synthetic coronary arteries and find that our method estimates directional wall shear stress (WSS) with an approximation error of 7.6% and a normalised mean absolute error (NMAE) of 0.4% while being up to two orders of magnitude faster than CFD. Furthermore, we show that our method is powerful enough to accurately predict transient, vector-valued WSS over the cardiac cycle while conditioned on a range of different inflow boundary conditions. These results demonstrate the potential of our proposed method as a plug-in replacement for CFD in the personalised prediction of hemodynamic vector and scalar fields.
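As an illustration of what an equivariant graph convolution over a surface mesh can look like, the following Python sketch implements an EGNN-style message-passing layer: invariant vertex features are updated from pairwise distances, and a vector output (e.g. a WSS-like field) is assembled from relative vertex positions so that it rotates with the mesh. This is a simplified, hypothetical layer for intuition only, not the paper's exact group-equivariant architecture; all names and dimensions are placeholders.

```python
# Minimal sketch of an E(3)-equivariant message-passing layer over mesh vertices
# (a hypothetical simplification, not the authors' exact group-equivariant layer).
import torch
import torch.nn as nn


class EquivariantMeshConv(nn.Module):
    def __init__(self, feat_dim: int, hidden_dim: int = 64):
        super().__init__()
        # phi_e: edge message from sender/receiver features and squared distance
        self.phi_e = nn.Sequential(
            nn.Linear(2 * feat_dim + 1, hidden_dim), nn.SiLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.SiLU(),
        )
        # phi_h: node feature update; phi_x: scalar weight for relative positions
        self.phi_h = nn.Sequential(
            nn.Linear(feat_dim + hidden_dim, hidden_dim), nn.SiLU(),
            nn.Linear(hidden_dim, feat_dim),
        )
        self.phi_x = nn.Linear(hidden_dim, 1)

    def forward(self, h, pos, edge_index):
        # h: (N, F) invariant vertex features, pos: (N, 3) vertex coordinates,
        # edge_index: (2, E) mesh edges as (sender, receiver) pairs
        src, dst = edge_index
        rel = pos[src] - pos[dst]                     # (E, 3) relative positions
        dist2 = (rel ** 2).sum(dim=-1, keepdim=True)  # rotation-invariant scalar
        msg = self.phi_e(torch.cat([h[src], h[dst], dist2], dim=-1))

        # Aggregate messages per receiving vertex and update invariant features
        agg = torch.zeros(h.size(0), msg.size(1), device=h.device)
        agg.index_add_(0, dst, msg)
        h_new = h + self.phi_h(torch.cat([h, agg], dim=-1))

        # Equivariant vector output: weighted sum of relative positions,
        # so rotating the input mesh rotates the predicted vectors with it.
        vec = torch.zeros_like(pos)
        vec.index_add_(0, dst, self.phi_x(msg) * rel)
        return h_new, vec
```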
Methods that allow the synthesis of realistic cell shapes could help generate training data sets to improve cell tracking and segmentation in biomedical images. Deep generative models for cell shape synthesis require a lightweight and flexible representation of cell shapes. However, commonly used voxel-based representations are not suitable for high-resolution shape synthesis, and polygon meshes have limitations when modelling topology changes such as cell growth or mitosis. In this work, we propose to use level sets of signed distance functions (SDFs) to represent cell shapes. We optimise a neural network as an implicit neural representation of the SDF value at any point in a 3D+time domain. The model is conditioned on a latent code, which allows the synthesis of new and unseen shape sequences. We validate our approach quantitatively and qualitatively on growing and dividing C. elegans cells and on lung cancer cells with growing, complex filopodial protrusions. Our results show that shape descriptors of the synthetic cells resemble those of real cells, and that our model is able to generate topologically plausible sequences of complex cell shapes in 3D+time.
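A minimal sketch of the core modelling idea as I read it from the abstract: a coordinate MLP that maps a point in 3D+time plus a per-sequence latent code to a signed distance, whose zero level set is the cell surface. The network sizes and the conditioning-by-concatenation are illustrative assumptions, not the authors' exact model.

```python
# Hedged sketch: implicit neural representation of a time-varying SDF,
# conditioned on a latent code (architecture details are placeholders).
import torch
import torch.nn as nn


class ConditionedSDF(nn.Module):
    def __init__(self, latent_dim: int = 64, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # signed distance at (x, y, z, t)
        )

    def forward(self, coords, latent):
        # coords: (N, 4) points in 3D+time, latent: (latent_dim,) per-sequence code
        z = latent.expand(coords.size(0), -1)
        return self.net(torch.cat([coords, z], dim=-1)).squeeze(-1)


# Usage: query the SDF on a dense grid at time t and extract the zero level set
# (e.g. with marching cubes) to obtain the synthetic cell surface.
model = ConditionedSDF()
points = torch.rand(1024, 4)        # random (x, y, z, t) samples in [0, 1]^4
latent = torch.randn(64)            # latent code for one shape sequence
sdf_values = model(points, latent)  # (1024,) signed distances
```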
Computational fluid dynamics (CFD) is a valuable tool for the personalised, non-invasive assessment of hemodynamics in arteries, but its complexity and time-consuming nature prohibit large-scale use in practice. Recently, the use of deep learning for the rapid estimation of CFD parameters such as wall shear stress (WSS) on surface meshes has been investigated. However, existing approaches typically depend on a hand-crafted re-parameterisation of the surface mesh to match convolutional neural network architectures. In this work, we propose to instead use mesh convolutional neural networks that directly operate on the same finite-element surface mesh as used in CFD. We train and evaluate our method on two datasets of synthetic coronary artery models, using ground truth obtained from CFD simulation. We show that our flexible deep learning model can accurately predict the 3D WSS vectors on this surface mesh. Our method processes new meshes in less than 5 minutes, consistently achieves a normalised mean absolute error of ≤ 1.6%, and peaks at a median approximation accuracy of 90.5% on the held-out test set, compared to previously published work. This demonstrates the feasibility of CFD surrogate modelling using mesh convolutional neural networks for hemodynamic parameter estimation in artery models.
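To make the "operate directly on the CFD surface mesh" idea concrete, here is a small hedged sketch (my own illustration, not the published code) that derives a vertex-adjacency edge list straight from the triangle faces of a finite-element surface mesh. Such an edge list can feed any graph or mesh convolution, for instance the equivariant layer sketched above, to regress a 3-component WSS vector per vertex without any re-parameterisation.

```python
# Build a vertex adjacency graph directly from the triangles of a CFD surface mesh.
import numpy as np
import torch


def edges_from_faces(faces: np.ndarray) -> torch.Tensor:
    """faces: (F, 3) vertex indices of a triangular surface mesh.
    Returns a (2, E) edge index containing both directions of every triangle edge."""
    e = np.concatenate([faces[:, [0, 1]], faces[:, [1, 2]], faces[:, [2, 0]]], axis=0)
    e = np.unique(np.sort(e, axis=1), axis=0)    # undirected unique edges
    e = np.concatenate([e, e[:, ::-1]], axis=0)  # make bidirectional
    return torch.as_tensor(e.T.copy(), dtype=torch.long)


faces = np.array([[0, 1, 2], [0, 2, 3]])         # two triangles sharing an edge
edge_index = edges_from_faces(faces)             # shape (2, 10)
```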
Periocular refers to the region of the face that surrounds the eye socket. This is a feature-rich area that can be used by itself to determine the identity of an individual. It is especially useful when the iris or the face cannot be reliably acquired, as in unconstrained or uncooperative scenarios where the face may appear partially occluded or the subject-to-camera distance may be high. The periocular region has also received revived attention during the pandemic, when masked faces leave the ocular region as the only visible facial area even in controlled scenarios. This paper discusses the state of the art of periocular biometrics, giving an overall framework of its most significant research aspects.
Feedforward fully convolutional neural networks currently dominate in semantic segmentation of 3D point clouds. Despite their great success, they suffer from the loss of local information at low-level layers, posing significant challenges to accurate scene segmentation and precise object boundary delineation. Prior works either address this issue by post-processing or jointly learn object boundaries to implicitly improve feature encoding of the networks. These approaches often require additional modules which are difficult to integrate into the original architecture. To improve the segmentation near object boundaries, we propose a boundary-aware feature propagation mechanism. This mechanism is achieved by exploiting a multi-task learning framework that aims to explicitly guide the boundaries to their original locations. With one shared encoder, our network outputs (i) boundary localization, (ii) prediction of directions pointing to the object's interior, and (iii) semantic segmentation, in three parallel streams. The predicted boundaries and directions are fused to propagate the learned features to refine the segmentation. We conduct extensive experiments on the S3DIS and SensatUrban datasets against various baseline methods, demonstrating that our proposed approach yields consistent improvements by reducing boundary errors. Our code is available at https://github.com/shenglandu/PushBoundary.
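The three parallel output streams can be pictured as three lightweight heads on top of one shared encoder. The sketch below is a hedged simplification with placeholder names and sizes; the paper's actual fusion of boundaries and directions into the feature-propagation step is omitted.

```python
# Hedged sketch of the multi-task output structure: one shared encoder,
# three parallel heads (boundary, direction, semantics). Placeholder sizes.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BoundaryAwareHeads(nn.Module):
    def __init__(self, feat_dim: int = 128, num_classes: int = 13):
        super().__init__()
        self.boundary_head = nn.Linear(feat_dim, 1)            # is the point near an object boundary?
        self.direction_head = nn.Linear(feat_dim, 3)           # unit vector toward the object interior
        self.semantic_head = nn.Linear(feat_dim, num_classes)  # per-point class logits

    def forward(self, point_features):
        # point_features: (N, feat_dim) from a shared point-cloud encoder
        boundary_logits = self.boundary_head(point_features).squeeze(-1)
        directions = F.normalize(self.direction_head(point_features), dim=-1)
        semantic_logits = self.semantic_head(point_features)
        return boundary_logits, directions, semantic_logits
```

In the paper, the predicted boundaries and directions additionally gate how features are propagated so that they are not smeared across object boundaries before the final segmentation; that fusion step is specific to the method and not reproduced here.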
Cartesian impedance control is a type of motion control strategy for robots that improves safety in partially unknown environments by achieving a compliant behavior of the robot with respect to its external forces. This compliant robot behavior has the added benefit of allowing physical human guidance of the robot. In this paper, we propose a C++ implementation of compliance control valid for any torque-commanded robotic manipulator. The proposed controller implements Cartesian impedance control to track a desired end-effector pose. Additionally, joint impedance is projected in the nullspace of the Cartesian robot motion to track a desired robot joint configuration without perturbing the Cartesian motion of the robot. The proposed implementation also allows the robot to apply desired forces and torques to its environment. Several safety features such as filtering, rate limiting, and saturation are included in the proposed implementation. The core functionalities are in a re-usable base library and a Robot Operating System (ROS) ros_control integration is provided on top of that. The implementation was tested with the KUKA LBR iiwa robot and the Franka Emika Robot (Panda) both in simulation and with the physical robots.
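The control law itself is compact. Below is an illustrative NumPy sketch of Cartesian impedance with a joint-impedance term projected into the nullspace of the task Jacobian; the paper's implementation is in C++, and the gains, dimensions, and the particular nullspace projector used here are my own simplifying choices.

```python
# Illustrative sketch of Cartesian impedance control with nullspace joint impedance.
import numpy as np


def impedance_torques(J, q, dq, x_err, dx, q_des,
                      Kp_cart, Kd_cart, Kp_joint, Kd_joint):
    """J: (6, n) geometric Jacobian; q, dq: (n,) joint positions/velocities;
    x_err: (6,) pose error (position + orientation); dx: (6,) end-effector twist;
    q_des: (n,) desired nullspace joint configuration; K*: gain matrices."""
    # Cartesian impedance: spring-damper wrench mapped to joint torques
    wrench = Kp_cart @ x_err - Kd_cart @ dx
    tau_task = J.T @ wrench

    # Joint impedance toward q_des, projected into the nullspace of the task
    tau_joint = Kp_joint @ (q_des - q) - Kd_joint @ dq
    J_pinv = np.linalg.pinv(J)
    nullspace = np.eye(q.size) - J.T @ J_pinv.T
    return tau_task + nullspace @ tau_joint
```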
Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as supply chains (inventory optimization), traffic, and the transition towards carbon-free energy generation through battery/load/production scheduling in sustainable energy systems. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively little research has been done in this area. This paper presents the findings of the "IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling", held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim of fostering and facilitating research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that leads to the lowest energy cost. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed-integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.
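As a toy illustration of the sample-average-approximation idea used by the winning method, the sketch below schedules battery purchases against many sampled price scenarios at once; the data, horizon, and constraints are invented and far simpler than the competition problem.

```python
# Toy sample average approximation: optimize one schedule against sampled scenarios.
import numpy as np
from scipy.optimize import linprog

T, S = 24, 100                                        # hours, forecast scenarios
rng = np.random.default_rng(0)
price_scenarios = rng.uniform(20, 120, size=(S, T))   # $/MWh per scenario

# Buy energy each hour to charge a battery that must reach 10 MWh by the end of
# the day, at most 1 MWh per hour. With a linear cost, minimizing the average
# cost over scenarios reduces to a single LP on the mean price.
c = price_scenarios.mean(axis=0)                      # expected cost coefficients
A_eq = np.ones((1, T))
b_eq = np.array([10.0])                               # total energy requirement
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0.0, 1.0)] * T, method="highs")
schedule = res.x                                      # buys in the 10 cheapest expected hours
```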
Visual language such as charts and plots is ubiquitous in the human world. Comprehending plots and charts requires strong reasoning skills. Prior state-of-the-art (SOTA) models require at least tens of thousands of training examples, and their reasoning capabilities are still quite limited, especially on complex human-written queries. This paper presents the first one-shot solution to visual language reasoning. We decompose the challenge of visual language reasoning into two steps: (1) plot-to-text translation, and (2) reasoning over the translated text. The key component in this method is a modality conversion module, named DePlot, which translates the image of a plot or chart to a linearized table. The output of DePlot can then be directly used to prompt a pretrained large language model (LLM), exploiting the few-shot reasoning capabilities of LLMs. To obtain DePlot, we standardize the plot-to-table task by establishing unified task formats and metrics, and train DePlot end-to-end on this task. DePlot can then be used off-the-shelf together with LLMs in a plug-and-play fashion. Compared with a SOTA model fine-tuned on more than 28k data points, DePlot+LLM with just one-shot prompting achieves a 24.0% improvement over the fine-tuned SOTA on human-written queries from the chart QA task.
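The two-step pipeline can be expressed in a few lines. In the schematic sketch below, plot_to_table stands in for the DePlot model and llm_complete for any pretrained LLM; both are hypothetical placeholders rather than real APIs, and the prompt wording is my own.

```python
# Schematic two-step pipeline: plot-to-table conversion, then LLM reasoning.
from typing import Callable


def answer_chart_question(image_bytes: bytes,
                          question: str,
                          plot_to_table: Callable[[bytes], str],   # placeholder for DePlot
                          llm_complete: Callable[[str], str]) -> str:  # placeholder for an LLM
    # Step 1: modality conversion -- render the chart as a linearized text table.
    table = plot_to_table(image_bytes)
    # Step 2: one-shot reasoning -- prompt the LLM with the table and the question.
    prompt = (
        "Read the following table and answer the question.\n\n"
        f"{table}\n\nQuestion: {question}\nAnswer:"
    )
    return llm_complete(prompt)
```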
Pre-training is an effective technique for ensuring robust performance on a variety of machine learning tasks. It typically depends on large-scale crawled corpora that can result in toxic or biased models. Such data can also be problematic with respect to copyright, attribution, and privacy. Pre-training with synthetic tasks and data is a promising way of alleviating such concerns since no real-world information is ingested by the model. Our goal in this paper is to understand what makes for a good pre-trained model when using synthetic resources. We answer this question in the context of neural machine translation by considering two novel approaches to translation model pre-training. Our first approach studies the effect of pre-training on obfuscated data derived from a parallel corpus by mapping words to a vocabulary of 'nonsense' tokens. Our second approach explores the effect of pre-training on procedurally generated synthetic parallel data that does not depend on any real human language corpus. Our empirical evaluation on multiple language pairs shows that, to a surprising degree, the benefits of pre-training can be realized even with obfuscated or purely synthetic parallel data. In our analysis, we consider the extent to which obfuscated and synthetic pre-training techniques can be used to mitigate the issue of hallucinated model toxicity.
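The first approach can be illustrated with a tiny obfuscation routine that deterministically replaces every word with an arbitrary 'nonsense' token, preserving word identity and co-occurrence structure while removing real-world meaning; the token format and bookkeeping here are my own illustrative choices, not the paper's exact procedure.

```python
# Map every word of a (parallel) corpus to a deterministic nonsense token.
def obfuscate(sentence: str, vocab: dict, prefix: str = "tok") -> str:
    out = []
    for word in sentence.split():
        if word not in vocab:
            vocab[word] = f"{prefix}{len(vocab)}"  # assign the next nonsense token
        out.append(vocab[word])
    return " ".join(out)


vocab = {}
src = obfuscate("the cat sat on the mat", vocab)   # "tok0 tok1 tok2 tok3 tok0 tok4"
tgt = obfuscate("le chat est assis sur le tapis", vocab)
# A translation model pre-trained on such pairs sees structure but no real language.
```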
Visual language data such as plots, charts, and infographics are ubiquitous in the human world. However, state-of-the-art vision-language models do not perform well on these data. We propose MatCha (Math reasoning and Chart derendering pretraining) to enhance visual language models' capabilities in jointly modeling charts/plots and language data. Specifically, we propose several pretraining tasks that cover plot deconstruction and numerical reasoning, which are key capabilities in visual language modeling. We perform MatCha pretraining starting from Pix2Struct, a recently proposed image-to-text visual language model. On standard benchmarks such as PlotQA and ChartQA, the MatCha model outperforms state-of-the-art methods by up to nearly 20%. We also examine how well MatCha pretraining transfers to domains such as screenshots, textbook diagrams, and document figures and observe overall improvement, verifying the usefulness of MatCha pretraining on broader visual language tasks.